Pattern Discovery as Event Association
Authors
Andrew K. C. Wong, University of Waterloo, Canada
Abstract
A basic task of machine learning and data mining is to automatically uncover patterns that reflect regularities in a data set. With a large database, especially when domain knowledge is unavailable or very weak, this can be a challenging task. The purpose of pattern discovery is to find non-random relations among events in data sets. For example, the “exclusive OR” (XOR) problem concerns three binary variables, A, B and C=A⊗B, i.e. C is true when either A or B, but not both, is true. Suppose that, without knowing it is the XOR problem, we want to check whether the occurrence of the compound event [A=T, B=T, C=F] is just a random happening. If we can estimate its frequency of occurrence under the random assumption, then we know that it is not random if the observed frequency deviates significantly from that estimate. We refer to such a compound event as an event association pattern, or simply a pattern, if its frequency of occurrence deviates significantly, in the statistical sense, from the default random assumption. For instance, suppose that an XOR database contains 1000 samples and each primary event (e.g. [A=T]) occurs 500 times. The expected frequency of occurrence of the compound event [A=T, B=T, C=F] under the independence assumption is 0.5×0.5×0.5×1000 = 125. If its observed frequency is 250, we want to determine whether the difference between the observed and expected frequencies (i.e. 250 – 125) is large enough to indicate that the compound event is not a random happening.
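The arithmetic in the XOR example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say which significance test is used, so the standardized residual (observed − expected)/√expected and the 1.96 cutoff (a two-sided 5% level under a normal approximation) are stand-ins, and the function names are hypothetical.

```python
import math

def expected_count(primary_counts, n_samples):
    """Expected count of a compound event if its primary events were independent."""
    probability = 1.0
    for count in primary_counts:
        probability *= count / n_samples  # marginal probability of one primary event
    return probability * n_samples

def standardized_residual(observed, expected):
    """Deviation of observed from expected, in units of sqrt(expected).

    Assumption: this simple residual stands in for whatever test the paper uses.
    """
    return (observed - expected) / math.sqrt(expected)

n_samples = 1000
# [A=T], [B=T] and [C=F] each occur 500 times, as in the example.
expected = expected_count([500, 500, 500], n_samples)
observed = 250

z = standardized_residual(observed, expected)
# Assumed cutoff: |z| > 1.96, i.e. a two-sided 5% level under a normal approximation.
is_pattern = abs(z) > 1.96

print(f"expected={expected}, observed={observed}, z={z:.2f}, pattern={is_pattern}")
```

With the numbers from the example, the expected count is 125 and the residual is about 11.2, far beyond the assumed cutoff, so under this sketch the compound event would be flagged as a pattern.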
Similar resources
Discovery of Frequent Episodes in Event Logs
The lion’s share of process mining research focuses on the discovery of end-to-end process models describing the characteristic behavior of observed cases. The notion of a process instance (i.e., the case) plays an important role in process mining. Pattern mining techniques (such as frequent itemset mining, association rule learning, sequence mining, and traditional episode mining) do not consider ...
Temporal Sequence Associations for Rare Events
In many real-world applications, systematic analysis of rare events, such as credit card fraud and adverse drug reactions, is very important. Their low occurrence rate in large databases often makes it difficult to identify the risk factors by straightforward application of association and sequential pattern discovery. In this paper we introduce a heuristic to guide the search for interesti...
IL-Miner: Instance-Level Discovery of Complex Event Patterns
Complex event processing (CEP) matches patterns over a continuous stream of events to detect situations of interest. Yet, the definition of an event pattern that precisely characterises a particular situation is challenging: there are manifold dimensions to correlate events, including time windows and value predicates. In the presence of historic event data that is labelled with the situation t...
Continuous and incremental data mining association rules using frame metadata model
Most organizations have large databases that contain a wealth of potentially accessible information. The unlimited growth of data will inevitably lead to a situation in which it is increasingly difficult to access the desired information. There is a need to extract knowledge from data by Knowledge Discovery in Databases (KDD). Data mining is the discovery stage of KDD whereas association rule is a pos...
From Association to Classification: Inference Using Weight of Evidence
Association and classification are two important tasks in data mining and knowledge discovery. Intensive studies have been carried out in both areas. However, how to apply discovered event associations to classification is still seldom addressed in current publications. Trying to bridge this gap, this paper extends our previous paper on significant event association discovery to classification. We prop...
Publication date: 2009